Chapter 2: Time Series Graphics

Translating fpp3 Chapter 2 examples from R to Clojure using tablecloth, tablecloth.time, and tableplot.

Reference: https://otexts.com/fpp3/graphics.html

Run with: clj -A:notebooks then evaluate in your editor, or render with Clay: (clay/make! {:source-path "notebooks/chapter_02_time_series_graphics.clj"})

(ns chapter-02-time-series-graphics
  (:require [tablecloth.api :as tc]
            [tablecloth.column.api :as tcc]
            [tech.v3.datatype :as dtype]
            [tech.v3.datatype.functional :as dfn]
            [tech.v3.dataset :as ds]
            [scicloj.tableplot.v1.plotly :as plotly]
            [scicloj.kindly.v4.kind :as kind]
            [clojure.data.json :as json]
            [tablecloth.time.api :as time-api]
            [tablecloth.time.column.api :as time-col])
  (:import [java.time LocalDate]))

2.1 — Loading data (tsibble equivalents)

In R, fpp3 provides datasets as tsibble objects with declared index and keys. In Clojure, we use plain tablecloth datasets loaded from CSV. The time column is just a column — no special metadata needed.

Helper: load fpp3 dataset

(defn load-fpp3
  "Load one of the fpp3 datasets from CSV."
  [name]
  (tc/dataset (str "data/fpp3/" name ".csv")))

PBS — Australian pharmaceutical benefit scheme

R: PBS (67,596 × 9, monthly, keyed by Concession/Type/ATC1/ATC2)

(def PBS (load-fpp3 "PBS"))
PBS

data/fpp3/PBS.csv [67596 9]:

MonthConcessionTypeATC1ATC1_descATC2ATC2_descScriptsCost
1991-07-01ConcessionalCo-paymentsAAlimentary tract and metabolismA01STOMATOLOGICAL PREPARATIONS18228.067877.00
1991-08-01ConcessionalCo-paymentsAAlimentary tract and metabolismA01STOMATOLOGICAL PREPARATIONS15327.057011.00
1991-09-01ConcessionalCo-paymentsAAlimentary tract and metabolismA01STOMATOLOGICAL PREPARATIONS14775.055020.00
1991-10-01ConcessionalCo-paymentsAAlimentary tract and metabolismA01STOMATOLOGICAL PREPARATIONS15380.057222.00
1991-11-01ConcessionalCo-paymentsAAlimentary tract and metabolismA01STOMATOLOGICAL PREPARATIONS14371.052120.00
1991-12-01ConcessionalCo-paymentsAAlimentary tract and metabolismA01STOMATOLOGICAL PREPARATIONS15028.054299.00
1992-01-01ConcessionalCo-paymentsAAlimentary tract and metabolismA01STOMATOLOGICAL PREPARATIONS11040.039753.00
1992-02-01ConcessionalCo-paymentsAAlimentary tract and metabolismA01STOMATOLOGICAL PREPARATIONS15165.054405.00
1992-03-01ConcessionalCo-paymentsAAlimentary tract and metabolismA01STOMATOLOGICAL PREPARATIONS16898.061108.00
1992-04-01ConcessionalCo-paymentsAAlimentary tract and metabolismA01STOMATOLOGICAL PREPARATIONS18141.065356.00
...........................
2007-08-01GeneralSafety netZZZ250.03030.15
2007-09-01GeneralSafety netZZZ281.03822.94
2007-10-01GeneralSafety netZZZ374.04346.34
2007-11-01GeneralSafety netZZZ501.06930.19
2007-12-01GeneralSafety netZZZ655.07939.00
2008-01-01GeneralSafety netZZZ797.09604.00
2008-02-01GeneralSafety netZZZ135.01591.00
2008-03-01GeneralSafety netZZZ15.0276.00
2008-04-01GeneralSafety netZZZ11.0165.00
2008-05-01GeneralSafety netZZZ21.0278.00
2008-06-01GeneralSafety netZZZ57.0491.00

The a10 pipeline

In R:

PBS |>
  filter(ATC2 == "A10") |>
  select(Month, Concession, Type, Cost) |>
  summarise(TotalC = sum(Cost)) |>
  mutate(Cost = TotalC / 1e6) -> a10

In Clojure:

(def a10
  (-> PBS
      (tc/select-rows #(= "A10" (get % "ATC2")))
      (tc/select-columns ["Month" "Concession" "Type" "Cost"])
      (tc/group-by ["Month"])
      (tc/aggregate {"TotalC" #(dfn/sum (% "Cost"))})
      (tc/add-column "Cost" #(dfn// (% "TotalC") 1e6))))
a10

_unnamed [204 3]:

|      Month |         TotalC |        Cost |
|------------|---------------:|------------:|
| 1991-07-01 | 3.52659100E+06 |  3.52659100 |
| 1991-08-01 | 3.18089100E+06 |  3.18089100 |
| 1991-09-01 | 3.25222100E+06 |  3.25222100 |
| 1991-10-01 | 3.61100300E+06 |  3.61100300 |
| 1991-11-01 | 3.56586900E+06 |  3.56586900 |
| 1991-12-01 | 4.30637100E+06 |  4.30637100 |
| 1992-01-01 | 5.08833500E+06 |  5.08833500 |
| 1992-02-01 | 2.81452000E+06 |  2.81452000 |
| 1992-03-01 | 2.98581100E+06 |  2.98581100 |
| 1992-04-01 | 3.20478000E+06 |  3.20478000 |
|        ... |            ... |         ... |
| 2007-08-01 | 2.39302035E+07 | 23.93020353 |
| 2007-09-01 | 2.29303569E+07 | 22.93035694 |
| 2007-10-01 | 2.32633399E+07 | 23.26333992 |
| 2007-11-01 | 2.52500302E+07 | 25.25003022 |
| 2007-12-01 | 2.58060900E+07 | 25.80609000 |
| 2008-01-01 | 2.96653560E+07 | 29.66535600 |
| 2008-02-01 | 2.16542850E+07 | 21.65428500 |
| 2008-03-01 | 1.82649450E+07 | 18.26494500 |
| 2008-04-01 | 2.31076770E+07 | 23.10767700 |
| 2008-05-01 | 2.29125100E+07 | 22.91251000 |
| 2008-06-01 | 1.94317400E+07 | 19.43174000 |
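
R's summarise() on a tsibble keeps the Month index implicitly, which is why the Clojure pipeline needs an explicit group-by. A minimal sketch of the same filter/group/sum logic in plain Clojure, without tablecloth. The rows are made up for illustration and total-cost-by-month is a hypothetical helper, not part of any library.

```clojure
;; Hypothetical rows standing in for the A10 subset of PBS (values are made up).
(def sample-rows
  [{"Month" "1991-07-01" "Concession" "Concessional" "Cost" 2000000.0}
   {"Month" "1991-07-01" "Concession" "General"      "Cost" 1500000.0}
   {"Month" "1991-08-01" "Concession" "Concessional" "Cost" 1000000.0}])

(defn total-cost-by-month
  "Sum Cost per Month, then express the total in millions."
  [rows]
  (->> rows
       (group-by #(get % "Month"))
       (map (fn [[month month-rows]]
              (let [total (reduce + (map #(get % "Cost") month-rows))]
                {"Month" month "TotalC" total "Cost" (/ total 1e6)})))
       (sort-by #(get % "Month"))
       vec))

(total-cost-by-month sample-rows)
;; => [{"Month" "1991-07-01", "TotalC" 3500000.0, "Cost" 3.5}
;;     {"Month" "1991-08-01", "TotalC" 1000000.0, "Cost" 1.0}]
```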

ansett — Ansett airlines weekly passenger data

(def ansett (load-fpp3 "ansett"))
ansett

data/fpp3/ansett.csv [7407 4]:

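|       Week | Airports |    Class | Passengers |
|------------|----------|----------|-----------:|
| 1989-07-10 |  ADL-PER | Business |      193.0 |
| 1989-07-17 |  ADL-PER | Business |      254.0 |
| 1989-07-24 |  ADL-PER | Business |      185.0 |
| 1989-07-31 |  ADL-PER | Business |      254.0 |
| 1989-08-07 |  ADL-PER | Business |      191.0 |
| 1989-08-14 |  ADL-PER | Business |      136.0 |
| 1989-08-21 |  ADL-PER | Business |        0.0 |
| 1989-08-28 |  ADL-PER | Business |        0.0 |
| 1989-09-04 |  ADL-PER | Business |        0.0 |
| 1989-09-11 |  ADL-PER | Business |        0.0 |
|        ... |      ... |      ... |        ... |
| 1992-09-07 |  SYD-PER |    First |      176.0 |
| 1992-09-14 |  SYD-PER |    First |      191.0 |
| 1992-09-21 |  SYD-PER |    First |      204.0 |
| 1992-09-28 |  SYD-PER |    First |      183.0 |
| 1992-10-05 |  SYD-PER |    First |      220.0 |
| 1992-10-12 |  SYD-PER |    First |      234.0 |
| 1992-10-19 |  SYD-PER |    First |      203.0 |
| 1992-10-26 |  SYD-PER |    First |      137.0 |
| 1992-11-02 |  SYD-PER |    First |      161.0 |
| 1992-11-09 |  SYD-PER |    First |      155.0 |
| 1992-11-16 |  SYD-PER |    First |      188.0 |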

aus_production — Australian quarterly production

(def aus-production (load-fpp3 "aus_production"))
aus-production

data/fpp3/aus_production.csv [218 7]:

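|    Quarter |  Beer | Tobacco | Bricks | Cement | Electricity |   Gas |
|------------|------:|--------:|-------:|-------:|------------:|------:|
| 1956-01-01 | 284.0 |  5225.0 |  189.0 |  465.0 |      3923.0 |   5.0 |
| 1956-04-01 | 213.0 |  5178.0 |  204.0 |  532.0 |      4436.0 |   6.0 |
| 1956-07-01 | 227.0 |  5297.0 |  208.0 |  561.0 |      4806.0 |   7.0 |
| 1956-10-01 | 308.0 |  5681.0 |  197.0 |  570.0 |      4418.0 |   6.0 |
| 1957-01-01 | 262.0 |  5577.0 |  187.0 |  529.0 |      4339.0 |   5.0 |
| 1957-04-01 | 228.0 |  5651.0 |  214.0 |  604.0 |      4811.0 |   7.0 |
| 1957-07-01 | 236.0 |  5317.0 |  227.0 |  603.0 |      5259.0 |   7.0 |
| 1957-10-01 | 320.0 |  6152.0 |  222.0 |  582.0 |      4735.0 |   6.0 |
| 1958-01-01 | 272.0 |  5758.0 |  199.0 |  554.0 |      4608.0 |   5.0 |
| 1958-04-01 | 233.0 |  5641.0 |  229.0 |  620.0 |      5196.0 |   7.0 |
|        ... |   ... |     ... |    ... |    ... |         ... |   ... |
| 2007-10-01 | 473.0 |         |        | 2562.0 |     56411.0 | 205.0 |
| 2008-01-01 | 420.0 |         |        | 2183.0 |     59118.0 | 194.0 |
| 2008-04-01 | 390.0 |         |        | 2558.0 |     56660.0 | 229.0 |
| 2008-07-01 | 410.0 |         |        | 2612.0 |     64067.0 | 249.0 |
| 2008-10-01 | 488.0 |         |        | 2373.0 |     59045.0 | 203.0 |
| 2009-01-01 | 415.0 |         |        | 1963.0 |     58368.0 | 196.0 |
| 2009-04-01 | 398.0 |         |        | 2160.0 |     57471.0 | 238.0 |
| 2009-07-01 | 419.0 |         |        | 2325.0 |     58394.0 | 252.0 |
| 2009-10-01 | 488.0 |         |        | 2273.0 |     57336.0 | 210.0 |
| 2010-01-01 | 414.0 |         |        | 1904.0 |     58309.0 | 205.0 |
| 2010-04-01 | 374.0 |         |        | 2401.0 |     58041.0 | 236.0 |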

vic_elec — Victorian half-hourly electricity demand

Time column is "2011-12-31 13:00:00" — not auto-parsed by tablecloth. Use tc/convert-types with a format pattern to parse it.

(def vic-elec
  (-> (load-fpp3 "vic_elec")
      (tc/convert-types "Time" [:local-date-time "yyyy-MM-dd HH:mm:ss"])))
vic-elec

data/fpp3/vic_elec.csv [52608 5]:

|             Time |      Demand | Temperature |       Date | Holiday |
|------------------|------------:|------------:|------------|---------|
| 2011-12-31T13:00 | 4382.825174 |       21.40 | 2012-01-01 |    True |
| 2011-12-31T13:30 | 4263.365526 |       21.05 | 2012-01-01 |    True |
| 2011-12-31T14:00 | 4048.966046 |       20.70 | 2012-01-01 |    True |
| 2011-12-31T14:30 | 3877.563330 |       20.55 | 2012-01-01 |    True |
| 2011-12-31T15:00 | 4036.229746 |       20.40 | 2012-01-01 |    True |
| 2011-12-31T15:30 | 3865.597244 |       20.25 | 2012-01-01 |    True |
| 2011-12-31T16:00 | 3694.097664 |       20.10 | 2012-01-01 |    True |
| 2011-12-31T16:30 | 3561.623686 |       19.60 | 2012-01-01 |    True |
| 2011-12-31T17:00 | 3433.035352 |       19.10 | 2012-01-01 |    True |
| 2011-12-31T17:30 | 3359.468000 |       18.95 | 2012-01-01 |    True |
|              ... |         ... |         ... |        ... |     ... |
| 2014-12-31T07:30 | 4244.465530 |       23.40 | 2014-12-31 |   False |
| 2014-12-31T08:00 | 4125.988252 |       22.30 | 2014-12-31 |   False |
| 2014-12-31T08:30 | 4013.262848 |       20.90 | 2014-12-31 |   False |
| 2014-12-31T09:00 | 3924.593434 |       20.30 | 2014-12-31 |   False |
| 2014-12-31T09:30 | 3893.867974 |       20.30 | 2014-12-31 |   False |
| 2014-12-31T10:00 | 3927.753088 |       20.30 | 2014-12-31 |   False |
| 2014-12-31T10:30 | 3873.448714 |       19.00 | 2014-12-31 |   False |
| 2014-12-31T11:00 | 3791.637322 |       18.50 | 2014-12-31 |   False |
| 2014-12-31T11:30 | 3724.835666 |       17.70 | 2014-12-31 |   False |
| 2014-12-31T12:00 | 3761.886854 |       17.30 | 2014-12-31 |   False |
| 2014-12-31T12:30 | 3809.414586 |       17.10 | 2014-12-31 |   False |
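
The format pattern passed to tc/convert-types is a java.time pattern. A minimal sketch of the same parse using java.time directly, plus a conversion that treats the stored value as UTC. The UTC reading is an assumption: the CSV carries no zone information, but the first row pairs Time 2011-12-31T13:00 with Date 2012-01-01, which is what a UTC export of Melbourne midnight would look like. parse-ts and utc->melbourne are hypothetical helper names.

```clojure
(import '[java.time LocalDateTime ZoneId ZoneOffset]
        '[java.time.format DateTimeFormatter])

;; Same pattern string as in the tc/convert-types call above.
(def ts-format (DateTimeFormatter/ofPattern "yyyy-MM-dd HH:mm:ss"))

(defn parse-ts
  "Parse a vic_elec-style timestamp string into a LocalDateTime."
  [s]
  (LocalDateTime/parse s ts-format))

(defn utc->melbourne
  "ASSUMPTION: interpret ldt as UTC and return the Melbourne wall-clock time."
  [^LocalDateTime ldt]
  (-> ldt
      (.atOffset ZoneOffset/UTC)
      (.atZoneSameInstant (ZoneId/of "Australia/Melbourne"))
      (.toLocalDateTime)))

(parse-ts "2011-12-31 13:00:00")
;; => 2011-12-31T13:00

(utc->melbourne (parse-ts "2011-12-31 13:00:00"))
;; => 2012-01-01T00:00
```

If the assumption holds, grouping by local day would mean applying a conversion like this before extracting date fields.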

olympic_running — Olympic running times

(def olympic-running (load-fpp3 "olympic_running"))
olympic-running

data/fpp3/olympic_running.csv [312 5]:

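| column-0 | Year | Length |   Sex |    Time |
|---------:|-----:|-------:|-------|--------:|
|        1 | 1896 |    100 |   men |   12.00 |
|        2 | 1900 |    100 |   men |   11.00 |
|        3 | 1904 |    100 |   men |   11.00 |
|        4 | 1908 |    100 |   men |   10.80 |
|        5 | 1912 |    100 |   men |   10.80 |
|        6 | 1916 |    100 |   men |         |
|        7 | 1920 |    100 |   men |   10.80 |
|        8 | 1924 |    100 |   men |   10.60 |
|        9 | 1928 |    100 |   men |   10.80 |
|       10 | 1932 |    100 |   men |   10.30 |
|      ... |  ... |    ... |   ... |     ... |
|      302 | 2008 |  10000 |   men | 1621.17 |
|      303 | 2012 |  10000 |   men | 1650.42 |
|      304 | 2016 |  10000 |   men | 1625.17 |
|      305 | 1988 |  10000 | women | 1865.21 |
|      306 | 1992 |  10000 | women | 1866.02 |
|      307 | 1996 |  10000 | women | 1861.63 |
|      308 | 2000 |  10000 | women | 1817.49 |
|      309 | 2004 |  10000 | women | 1824.36 |
|      310 | 2008 |  10000 | women | 1794.66 |
|      311 | 2012 |  10000 | women | 1820.75 |
|      312 | 2016 |  10000 | women | 1757.45 |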

2.2 — Time plots

R: autoplot(melsyd_economy, Passengers)
Clojure: tableplot line chart

Ansett airlines: Melbourne-Sydney economy class

(def melsyd-economy
  (-> ansett
      (tc/select-rows #(and (= "MEL-SYD" (get % "Airports"))
                            (= "Economy" (get % "Class"))))
      (tc/add-column "Passengers (000s)" #(dfn// (% "Passengers") 1000))))
(-> melsyd-economy
    (plotly/layer-line {:=x "Week"
                        :=y "Passengers (000s)"
                        :=title "Ansett airlines economy class: Melbourne-Sydney"}))

Notable features visible in the plot:

  • 1989: No passengers (industrial dispute)
  • 1992: Reduced load (economy seats replaced by business class trial)
  • Late 1991: Large increase in passenger load
  • Start of each year: Holiday-effect dips

Antidiabetic drug sales (a10)

(-> a10
    (plotly/layer-line {:=x "Month"
                        :=y "Cost"
                        :=title "Australian antidiabetic drug sales"
                        :=y-title "$ (millions)"}))

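Clear increasing trend with strong seasonality whose amplitude grows with the level of the series. The January spike each year probably reflects late-December stockpiling: the government subsidy scheme makes it cost-effective to stock up before the end of the calendar year, and those sales are registered in January.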

2.3 — Time series patterns

Three fundamental components:

  • Trend: long-term increase/decrease
  • Seasonal: fixed, known period (calendar-linked)
  • Cyclic: rises/falls of variable, non-fixed frequency

Two of the patterns illustrated in Figure 2.3, shown here with beer production and Google stock data:

Australian quarterly beer production (seasonal + no trend)

(def recent-beer
  (-> aus-production
      (tc/select-rows #(>= (.getYear (get % "Quarter")) 2000))
      (tc/select-columns ["Quarter" "Beer"])))
(-> recent-beer
    (plotly/layer-line {:=x "Quarter"
                        :=y "Beer"
                        :=title "Australian quarterly beer production"}))

Google stock daily changes (no pattern; the closing price itself resembles a random walk)

(def gafa (load-fpp3 "gafa_stock"))
(def google-2015
  (-> gafa
      (tc/select-rows #(and (= "GOOG" (get % "Symbol"))
                            (= (.getYear (get % "Date")) 2015)))))

Daily closing price

(-> google-2015
    (plotly/layer-line {:=x "Date"
                        :=y "Close"
                        :=title "Google daily closing stock price (2015)"}))

Daily change in closing price

(let [closes (vec (google-2015 "Close"))
      diffs (mapv - (rest closes) (butlast closes))]
  (-> (tc/dataset {"Date" (rest (vec (google-2015 "Date")))
                    "Change" diffs})
      (plotly/layer-line {:=x "Date"
                          :=y "Change"
                          :=title "Google daily change in closing stock price (2015)"})))

2.4 — Seasonal plots

R: gg_season(a10, Cost) — overlay each year on the same month axis. We extract year and month, then plot with color = year.

tablecloth.time has add-time-columns — a dataset-level operation that extracts datetime fields in one call. tablecloth auto-parses our CSV date strings into :packed-local-date, so these work out of the box.

Vector form: column names match field names
Map form: explicit output names

(def a10-seasonal
  (-> a10
      (time-api/add-time-columns "Month" {:year "Year" :month "MonthNum"})))

Year is int64 — tableplot treats numeric columns as continuous color scales. Convert to string so it's treated as categorical (one line per year).

(-> a10-seasonal
    (tc/add-column "YearStr" #(mapv str (% "Year")))
    (plotly/layer-line {:=x "MonthNum"
                        :=y "Cost"
                        :=color "YearStr"
                        :=title "Seasonal plot: Antidiabetic drug sales"
                        :=x-title "Month"
                        :=y-title "$ (millions)"}))

Each line is one year. The seasonal shape is clear:

  • January spike (stockpiling)
  • Generally higher in second half of year
  • The whole pattern shifts upward year over year (trend)
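
A minimal sketch of what the year/month extraction and the color grouping amount to, using java.time and plain Clojure rather than tablecloth.time. year-and-month and series-by-year are hypothetical helpers, and the cost values are made up.

```clojure
(import '[java.time LocalDate])

(defn year-and-month
  "Extract the two fields used for the seasonal plot from a LocalDate."
  [^LocalDate d]
  {"Year" (.getYear d) "MonthNum" (.getMonthValue d)})

(defn series-by-year
  "Group [date cost] pairs into one [month cost] series per year.
   Keys are year strings, so a plotting layer would treat them as categories."
  [pairs]
  (reduce (fn [acc [date cost]]
            (let [{y "Year" m "MonthNum"} (year-and-month date)]
              (update acc (str y) (fnil conj []) [m cost])))
          (sorted-map)
          pairs))

(series-by-year [[(LocalDate/parse "1991-07-01") 3.5]
                 [(LocalDate/parse "1991-08-01") 3.2]
                 [(LocalDate/parse "1992-01-01") 5.1]])
;; => {"1991" [[7 3.5] [8 3.2]], "1992" [[1 5.1]]}
```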

Multiple seasonal periods — vic_elec

Electricity demand has daily, weekly, and yearly patterns.
R: gg_season(Demand, period = "day"|"week"|"year")

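Now that Time is parsed as LocalDateTime, we can use add-time-columns directly. The computed fields cover fractional hours, phases, and string conversions. NOTE: The "Date" column does not match the calendar date of the timestamp: the first row pairs Time 2011-12-31T13:00 with Date 2012-01-01. This is consistent with Time having been exported in UTC while Date is the local (Melbourne) calendar date, although the CSV itself records no time zone. We derive all groupings from the Time column, so the daily and weekly phases below are relative to the Time values as stored, not necessarily to local midnight.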

(def vic-elec-with-fields
  (-> vic-elec
      (time-api/add-time-columns "Time" 
        {;; Basic fields
         :day-of-week "DayOfWeek"
         :day-of-year "DayOfYear"
         :week-of-year "WeekOfYear"
         :year "Year"
         ;; Computed fields for seasonal plots
         :hour-fractional "HourOfDay"
         :daily-phase "DailyPhase"
         :weekly-phase "WeeklyPhase"
         :week-of-year-index "WeekIndex"
         :date-string "TimeDate"
         :year-string "YearStr"
         :week-string "WeekLabel"
         :year-week-string "YearWeek"})))

Helper: seasonal-plot-spec

Generate a Plotly spec for seasonal plots using tableplot as the base. This uses tableplot's layer-line with :=color to generate multiple traces, then post-processes to hide legend and set custom colors.

(defn seasonal-plot-spec
  "Generate a Plotly spec for a seasonal plot.
   - ds: dataset (should include phase column)
   - phase-col: column for x-axis (phase within period, 0 to 1)
   - value-col: column for y-axis  
   - group-col: column to group by (creates one trace per unique value)
   - color-fn: fn from group-name (string) -> color string
   Options:
   - :line-width (default 0.3)
   - :title, :x-title, :y-title for axis labels"
  [ds phase-col value-col group-col color-fn 
   & {:keys [line-width title x-title y-title]
      :or {line-width 0.3}}]
  (let [;; Use tableplot to generate base spec with traces
        viz (-> ds
                (tc/order-by phase-col)
                (plotly/layer-line {:=x phase-col 
                                    :=y value-col 
                                    :=color group-col
                                    :=title title
                                    :=x-title x-title
                                    :=y-title y-title}))
        ;; Extract final Plotly spec using tableplot's official API
        spec (plotly/plot viz)]
    ;; Post-process traces: hide legend, set colors
    (update spec :data 
            #(mapv (fn [trace]
                     (-> trace
                         (assoc :showlegend false)
                         (assoc-in [:line :color] (color-fn (:name trace)))
                         (assoc-in [:line :width] line-width)))
                   %))))
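
The post-processing step is ordinary map manipulation once the spec is realized. A minimal sketch on a hand-written spec, so it runs without tableplot. mock-spec is a made-up stand-in for what plotly/plot returns (only the :data and :layout keys are assumed), and style-traces is a hypothetical name for the update step.

```clojure
;; Made-up stand-in for a realized Plotly spec: a vector of traces plus a layout.
(def mock-spec
  {:data [{:name "2012-01-01" :x [0.0 0.5] :y [4.3 4.0]}
          {:name "2013-06-15" :x [0.0 0.5] :y [5.1 4.8]}]
   :layout {:title "mock"}})

(defn style-traces
  "Hide the legend entry and set line color/width on every trace."
  [spec color-fn line-width]
  (update spec :data
          (fn [traces]
            (mapv (fn [trace]
                    (-> trace
                        (assoc :showlegend false)
                        (assoc-in [:line :color] (color-fn (:name trace)))
                        (assoc-in [:line :width] line-width)))
                  traces))))

;; Color by the first four characters of the trace name (the year).
(def year-color
  #(get {"2012" "#1b9e77" "2013" "#d95f02"} (subs % 0 4) "gray"))

(-> (style-traces mock-spec year-color 0.3) :data first :line)
;; => {:color "#1b9e77", :width 0.3}
```

Note that color-fn receives the trace's :name, so it will throw if a trace has no name or a name shorter than four characters.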

Daily pattern: phase = hour/24, each day overlaid. Using seasonal-plot-spec helper with tableplot as base.

(let [year-color #(get {"2011" "#7570b3" "2012" "#1b9e77" "2013" "#d95f02" "2014" "#7570b3"}
                       (subs % 0 4) "gray")]
  (kind/plotly
    (seasonal-plot-spec vic-elec-with-fields
                        "DailyPhase" "Demand" "TimeDate"
                        year-color
                        :title "Electricity demand: Victoria (daily pattern)"
                        :x-title "Phase of day (0=midnight, 0.5=noon)"
                        :y-title "MWh")))

Weekly pattern: phase = hours_since_monday / 168, each week overlaid. Using seasonal-plot-spec helper with tableplot as base.

(let [year-color #(get {"2011" "#7570b3" "2012" "#1b9e77" "2013" "#d95f02" "2014" "#7570b3"}
                       (subs % 0 4) "gray")]
  (kind/plotly
    (seasonal-plot-spec vic-elec-with-fields
                        "WeeklyPhase" "Demand" "YearWeek"
                        year-color
                        :title "Electricity demand: Victoria (weekly pattern)"
                        :x-title "Phase of week (0=Mon, 0.5=Thu noon, 1=Sun midnight)"
                        :y-title "MWh")))
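
The two phase definitions above (hour/24 and hours since Monday/168) can be sketched with java.time alone. These functions are an illustration of the formulas as stated in this notebook, not the tablecloth.time implementation; the function names simply mirror the field keywords.

```clojure
(import '[java.time LocalDateTime])

(defn hour-fractional
  "Hour of day including minutes, e.g. 13:30 -> 13.5."
  [^LocalDateTime t]
  (+ (.getHour t) (/ (.getMinute t) 60.0)))

(defn daily-phase
  "Position within the day, 0.0 at midnight up to (but excluding) 1.0."
  [t]
  (/ (hour-fractional t) 24.0))

(defn weekly-phase
  "Position within the week, 0.0 at Monday 00:00 (a week has 168 hours)."
  [^LocalDateTime t]
  (let [days-since-monday (dec (.getValue (.getDayOfWeek t)))] ; Monday = 1
    (/ (+ (* 24 days-since-monday) (hour-fractional t)) 168.0)))

;; 2012-01-05 was a Thursday.
(daily-phase  (LocalDateTime/of 2012 1 5 12 0)) ;; => 0.5
(weekly-phase (LocalDateTime/of 2012 1 5 12 0)) ;; => 0.5
```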

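Yearly pattern: x = day of year, one line per year. The data covers three full years, so there are only a handful of traces (the few 2011-12-31 timestamps form a short extra one) and a plain tableplot call is enough.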

(-> vic-elec-with-fields
    (plotly/layer-line {:=x "DayOfYear"
                        :=y "Demand"
                        :=color "YearStr"
                        :=title "Electricity demand: Victoria (yearly pattern)"
                        :=x-title "Day of year"
                        :=y-title "MWh"}))

2.5 — Seasonal subseries plots

R: gg_subseries(a10, Cost) — for each month, show values across years with the mean as a horizontal line.

Group by month, plot each month's values over years

(def a10-subseries
  (-> a10
      (time-api/add-time-columns "Month" {:year "Year" :month "MonthNum"})))

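Colored by month: each line shows one month across all years. Convert MonthNum to string so tableplot treats it as categorical. Unlike gg_subseries, this version uses a single panel rather than one facet per month, and it does not draw the per-month mean line.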

(-> a10-subseries
    (tc/add-column "MonthLabel" #(mapv str (% "MonthNum")))
    (plotly/layer-line {:=x "Year"
                        :=y "Cost"
                        :=color "MonthLabel"
                        :=title "Seasonal subseries plot: Antidiabetic drug sales"
                        :=y-title "$ (millions)"}))
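
gg_subseries also marks each month's mean across years. A minimal sketch of that computation in plain Clojure, assuming rows shaped like a10-subseries (string keys "MonthNum" and "Cost"). The sample values are made up and monthly-means is a hypothetical helper.

```clojure
(defn mean [xs]
  (/ (reduce + xs) (double (count xs))))

(defn monthly-means
  "Mean Cost per month number across all years, as a sorted map."
  [rows]
  (->> rows
       (group-by #(get % "MonthNum"))
       (map (fn [[month month-rows]]
              [month (mean (map #(get % "Cost") month-rows))]))
       (into (sorted-map))))

(monthly-means [{"Year" 1992 "MonthNum" 1 "Cost" 5.0}
                {"Year" 1993 "MonthNum" 1 "Cost" 7.0}
                {"Year" 1992 "MonthNum" 2 "Cost" 3.0}])
;; => {1 6.0, 2 3.0}
```

Each mean could then be drawn as a horizontal reference line in the corresponding month's panel.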

2.6 — Scatterplots

R: ggplot(aes(x = Temperature, y = Demand)) + geom_point()
Electricity demand vs temperature for 2014 Victoria.

(def vic-elec-2014
  (-> vic-elec
      (tc/select-rows #(= (.getYear (get % "Time")) 2014))))
(-> vic-elec-2014
    (plotly/layer-point {:=x "Temperature"
                         :=y "Demand"
                         :=title "Electricity demand vs temperature (Victoria, 2014)"
                         :=x-title "Temperature (°C)"
                         :=y-title "Demand (MWh)"}))

The U-shape: high demand for both cold (heating) and hot (air conditioning). Correlation coefficient r = 0.28 — misleading for non-linear relationships. Always plot first!
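
To see why r understates a U-shaped relationship, here is a minimal Pearson correlation in plain Clojure applied to toy data (not the vic_elec values). pearson-r is a hypothetical helper.

```clojure
(defn pearson-r
  "Sample Pearson correlation coefficient of two equal-length sequences."
  [xs ys]
  (let [n   (count xs)
        mx  (/ (reduce + xs) (double n))
        my  (/ (reduce + ys) (double n))
        dx  (map #(- % mx) xs)
        dy  (map #(- % my) ys)
        sxy (reduce + (map * dx dy))
        sxx (reduce + (map * dx dx))
        syy (reduce + (map * dy dy))]
    (/ sxy (Math/sqrt (* sxx syy)))))

(def xs [-2 -1 0 1 2])

;; Perfect U-shape (y = x^2): y is fully determined by x, yet r is zero.
(pearson-r xs (map #(* % %) xs))        ;; => 0.0

;; Perfect straight line (y = 1 + 2x): r is one.
(pearson-r xs (map #(+ 1 (* 2 %)) xs))  ;; => 1.0
```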

Scatterplot matrix — Australian tourism by state

(def tourism (load-fpp3 "tourism"))
(def visitors-by-state
  (-> tourism
      (tc/group-by ["Quarter" "State"])
      (tc/aggregate {"Trips" #(dfn/sum (% "Trips"))})
      ;; Pivot wider: one column per state
      (tc/pivot->wider "State" "Trips")))
visitors-by-state

_unnamed [80 9]:

|    Quarter | South Australia | Northern Territory | Western Australia |     Victoria | New South Wales |   Queensland |         ACT |     Tasmania |
|------------|----------------:|-------------------:|------------------:|-------------:|----------------:|-------------:|------------:|-------------:|
| 1998-01-01 |    1735.4384181 |        181.4488234 |      1641.0894945 | 6010.4244905 |    8039.7947954 | 4041.3701591 | 551.0019215 |  981.6291663 |
| 1998-04-01 |    1394.6383194 |        313.9361515 |      1576.3265337 | 4795.2467552 |    7166.0138053 | 3967.9046526 | 416.0256229 |  693.2882267 |
| 1998-07-01 |    1213.3307229 |        528.4368592 |      1588.2936922 | 4316.8451697 |    6747.9357896 | 4593.8939910 | 436.0290111 |  401.8752750 |
| 1998-10-01 |    1452.5699692 |        247.7028173 |      1839.7169904 | 4674.8291177 |    7282.0823712 | 4202.8291407 | 449.7984449 |  680.6010392 |
| 1999-01-01 |    1541.1817910 |        184.8895923 |      1835.6875729 | 5304.3341954 |    7584.7768389 | 4332.4908503 | 378.5728168 |  925.4197220 |
| 1999-04-01 |    1636.1154606 |        366.0927858 |      1836.9642903 | 4561.7109009 |    7054.0387051 | 4824.4804518 | 558.1781421 |  620.7925480 |
| 1999-07-01 |    1282.9490168 |        501.4312841 |      1725.3748681 | 3783.6010393 |    6723.6771461 | 5018.0345042 | 448.9011959 |  430.2234538 |
| 1999-10-01 |    1386.9430724 |        248.4302996 |      1518.9426972 | 4201.4224779 |    7135.7669454 | 4349.8358766 | 594.8254416 |  591.7588297 |
| 2000-01-01 |    1832.8466955 |        206.0532200 |      1635.5250181 | 5566.8572876 |    7295.9866387 | 4413.2262339 | 599.6685405 |  789.1311444 |
| 2000-04-01 |    1415.4842932 |        359.7619246 |      1808.3286160 | 4501.9042822 |    6445.2201677 | 4344.1592770 | 557.1350565 |  509.0698648 |
|        ... |             ... |                ... |               ... |          ... |             ... |          ... |         ... |          ... |
| 2015-04-01 |    1561.2069129 |        533.0254161 |      2394.3740364 | 5284.4708775 |    7375.2085323 | 5555.8301970 | 516.8703428 |  577.9280518 |
| 2015-07-01 |    1383.9080928 |        682.5518237 |      2467.0642463 | 4981.1688294 |    6961.2502033 | 5866.0704509 | 688.2031884 |  455.5288199 |
| 2015-10-01 |    1700.3825231 |        423.2025164 |      2743.4490648 | 5550.8237034 |    7760.0190221 | 5532.2106436 | 597.2455694 |  832.8281787 |
| 2016-01-01 |    1871.4572503 |        352.0984958 |      2871.1718747 | 6599.7004830 |    8287.8720804 | 5042.1537503 | 625.1416344 | 1011.0421206 |
| 2016-04-01 |    1660.1266358 |        469.1365356 |      2469.3942809 | 5335.2297369 |    7660.5796206 | 5446.5536505 | 592.6084987 |  651.3987982 |
| 2016-07-01 |    1534.2342547 |        713.8207499 |      2317.0445437 | 5221.8805097 |    7326.3046707 | 5939.3346446 | 572.4370928 |  566.2636651 |
| 2016-10-01 |    1635.4677141 |        340.6959274 |      2656.3307006 | 6113.0164129 |    7995.5021920 | 6106.3838529 | 667.2141410 |  832.9900323 |
| 2017-01-01 |    1815.4121833 |        298.1660264 |      2570.9116892 | 7269.5270136 |    8320.7034623 | 5451.9871280 | 634.3687469 | 1135.3127709 |
| 2017-04-01 |    1660.1320376 |        621.4413922 |      2438.4879387 | 5901.3865473 |    8285.0262795 | 5638.4735354 | 748.2904309 |  820.3685463 |
| 2017-07-01 |    1514.7861308 |        597.6583969 |      2493.9549995 | 5817.9715437 |    8298.2574173 | 6533.8369322 | 631.7599043 |  618.0893828 |
| 2017-10-01 |    1869.1069854 |        346.0619384 |      2635.7542961 | 6865.3988511 |    8542.4906073 | 5813.9036668 | 720.3293701 |  800.5084986 |

For a scatterplot matrix, we'd plot each state column against every other. Tableplot doesn't have a built-in pairs plot, but you can compose them. Here's one pair as an example — NSW vs Victoria:

(-> visitors-by-state
    (plotly/layer-point {:=x "New South Wales"
                         :=y "Victoria"
                         :=title "Tourism: NSW vs Victoria (quarterly trips)"}))
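
Composing a pairs plot starts with enumerating the column pairs. A minimal sketch in plain Clojure; column-pairs is a hypothetical helper and the state names are the column headers from the pivoted dataset above.

```clojure
(defn column-pairs
  "All unordered pairs [x y] of distinct columns, in input order."
  [cols]
  (for [[i x] (map-indexed vector cols)
        y     (drop (inc i) cols)]
    [x y]))

(def states
  ["South Australia" "Northern Territory" "Western Australia" "Victoria"
   "New South Wales" "Queensland" "ACT" "Tasmania"])

(count (column-pairs states))
;; => 28

(first (column-pairs states))
;; => ["South Australia" "Northern Territory"]
```

Each pair could then be passed as :=x and :=y to a plotly/layer-point call like the one above, giving 28 panels for the lower triangle of the matrix.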

2.7 — Lag plots

R: gg_lag(Beer, geom = "point")
Plot y_t against y_{t-k} for various lags. This is the visual precursor to autocorrelation.

For beer production, lag 4 should show strong positive correlation (seasonal).
Using tablecloth.time.api/add-lags, with rows that have a missing lagged value dropped.
Note: the code below refers to the new column as "Beer_lag4" (a string, like the source column name). If your version of add-lags produces a keyword such as :Beer_lag4, use that name instead.

(-> recent-beer
    (time-api/add-lags "Beer" [4])
    (plotly/layer-point {:=x "Beer_lag4"
                         :=y "Beer"
                         :=title "Lag 4 plot: Australian beer production"
                         :=x-title "Beer (t-4)"
                         :=y-title "Beer (t)"}))

Strong positive diagonal = strong correlation at lag 4 (Q4 peaks align with Q4 peaks from the previous year)
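
What a lag column does, as a minimal sketch in plain Clojure: pair each value with the value k steps earlier and drop the first k positions, which have no lagged partner. lag-pairs is a hypothetical helper, not the tablecloth.time function, and the numbers are made up.

```clojure
(defn lag-pairs
  "Pair y_{t-k} (:lagged) with y_t (:current) for every t that has both."
  [ys k]
  (mapv (fn [lagged current] {:lagged lagged :current current})
        ys
        (drop k ys)))

(lag-pairs [10 20 30 40 50 60] 4)
;; => [{:lagged 10, :current 50} {:lagged 20, :current 60}]
```

With quarterly data and k = 4, each pair compares a quarter with the same quarter one year earlier, which is why a seasonal series lines up along the diagonal.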

2.8 — Autocorrelation (ACF)

r_k = Σ_{t=k+1..T} (y_t - ȳ)(y_{t-k} - ȳ) / Σ_{t=1..T} (y_t - ȳ)²

where T is the number of observations and ȳ is the sample mean.

This is a core function we need in tablecloth.time. For now, let's compute it manually.

(defn acf
  "Compute autocorrelation coefficients for lags 1..max-lag.
  Returns a dataset with \"lag\" and \"acf\" columns; nil values are dropped first."
  [values max-lag]
  (let [values (double-array (remove nil? values))
        n (alength values)
        mean (/ (areduce values i sum 0.0 (+ sum (aget values i))) n)
        ;; denominator: Σ(y_t - ȳ)²
        denom (areduce values i sum 0.0
                       (let [d (- (aget values i) mean)]
                         (+ sum (* d d))))
        lags (range 1 (inc max-lag))
        acf-vals (mapv (fn [k]
                         (let [numer (loop [t k, sum 0.0]
                                      (if (>= t n)
                                        sum
                                        (recur (inc t)
                                               (+ sum (* (- (aget values t) mean)
                                                        (- (aget values (- t k)) mean))))))]
                           (/ numer denom)))
                       lags)]
    (tc/dataset {"lag" (vec lags)
                 "acf" acf-vals})))

ACF of beer production

(def beer-acf (acf (recent-beer "Beer") 9))
beer-acf

_unnamed [9 2]:

| lag |         acf |
|----:|------------:|
|   1 | -0.05298108 |
|   2 | -0.75817544 |
|   3 | -0.02623376 |
|   4 |  0.80220453 |
|   5 | -0.07747120 |
|   6 | -0.65745127 |
|   7 |  0.00119492 |
|   8 |  0.70725408 |
|   9 | -0.08875626 |

Should match R output: lag 1: -0.053, lag 2: -0.758, lag 4: 0.802, lag 8: 0.707
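
A hand-checkable version of the same formula, as a minimal sketch on plain vectors (no arrays, no dataset). acf-vec is a hypothetical name. For the series 1..5 the mean is 3 and the denominator is 10, so r_1 = 4/10 and r_2 = -1/10.

```clojure
(defn acf-vec
  "Autocorrelations r_1..r_max-lag of a numeric sequence, as a vector."
  [ys max-lag]
  (let [ys    (mapv double ys)
        mean  (/ (reduce + ys) (count ys))
        devs  (mapv #(- % mean) ys)            ; y_t - mean
        denom (reduce + (map * devs devs))]    ; sum of squared deviations
    (mapv (fn [k]
            ;; (drop k devs) is y_t for t > k; devs supplies the matching y_{t-k}
            (/ (reduce + (map * (drop k devs) devs)) denom))
          (range 1 (inc max-lag)))))

(acf-vec [1 2 3 4 5] 2)
;; => [0.4 -0.1]
```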

(let [T (count (remove nil? (vec (recent-beer "Beer"))))
      bound (/ 1.96 (Math/sqrt T))]
  (-> beer-acf
      (tc/add-column "upper" (repeat (tc/row-count beer-acf) bound))
      (tc/add-column "lower" (repeat (tc/row-count beer-acf) (- bound)))
      (plotly/layer-bar {:=x "lag"
                         :=y "acf"
                         :=title "ACF: Australian beer production"
                         :=y-title "Autocorrelation"})))

  • r₄ is highest (seasonal: peaks 4 quarters apart)
  • r₂ is most negative (peaks vs troughs, 2 quarters apart)
  • The significance bounds ±1.96/√T are computed into the "upper" and "lower" columns; the bar layer above does not draw them as dashed lines

ACF of antidiabetic drug sales (trend + seasonality)

(def a10-acf (acf (a10 "Cost") 48))
(-> a10-acf
    (plotly/layer-bar {:=x "lag"
                       :=y "acf"
                       :=title "ACF: Australian antidiabetic drug sales"}))

Slow decay (trend) + scalloped shape (seasonality at lag 12, 24, 36...)

2.9 — White noise

A white noise series has no autocorrelation. For white noise, about 95% of the ACF spikes are expected to fall within ±1.96/√T.

(def white-noise
  (tc/dataset {"t" (range 1 51)
               ;; Box-Muller transform; (- 1.0 (rand)) lies in (0, 1], so the log never sees 0
               "wn" (repeatedly 50 #(let [u1 (- 1.0 (rand)) u2 (rand)]
                                      (* (Math/sqrt (* -2 (Math/log u1)))
                                         (Math/cos (* 2 Math/PI u2)))))}))
(-> white-noise
    (plotly/layer-line {:=x "t"
                        :=y "wn"
                        :=title "White noise"}))
(def wn-acf (acf (white-noise "wn") 15))
(let [bound (/ 1.96 (Math/sqrt 50))]
  (-> wn-acf
      (tc/add-column "upper" (repeat (tc/row-count wn-acf) bound))
      (tc/add-column "lower" (repeat (tc/row-count wn-acf) (- bound)))
      (plotly/layer-bar {:=x "lag"
                         :=y "acf"
                         :=title "ACF: White noise"})))
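
The series above changes on every evaluation because rand is unseeded. A minimal sketch of a reproducible alternative using java.util.Random, whose nextGaussian draws standard normal values. gaussian-noise is a hypothetical helper and the seed is arbitrary.

```clojure
(import '[java.util Random])

(defn gaussian-noise
  "Vector of n standard-normal draws from a generator seeded with seed."
  [n seed]
  (let [rng (Random. (long seed))]
    (vec (repeatedly n #(.nextGaussian rng)))))

;; Same seed, same series: the plot and its ACF become repeatable.
(= (gaussian-noise 50 42) (gaussian-noise 50 42))
;; => true
```

The resulting vector could replace the "wn" column in the dataset above.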

About 95% of spikes should be within ±0.28 (≈ 1.96/√50); an occasional spike just outside is expected by chance. If that holds, there is no signal to model.
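
The bound and the check against it, as a minimal sketch in plain Clojure. significance-bound and outside-bounds are hypothetical helpers, and the ACF values in the example are made up.

```clojure
(defn significance-bound
  "Approximate 95% bound for white-noise autocorrelations: 1.96 / sqrt(T)."
  [n]
  (/ 1.96 (Math/sqrt n)))

(defn outside-bounds
  "The autocorrelations whose absolute value exceeds the bound for series length n."
  [acf-vals n]
  (let [bound (significance-bound n)]
    (filterv #(> (Math/abs (double %)) bound) acf-vals)))

(significance-bound 50)
;; => roughly 0.277

(outside-bounds [0.1 -0.3 0.27 0.5] 50)
;; => [-0.3 0.5]
```

With 15 lags, 5% corresponds to fewer than one spike on average, so a single value slightly outside the bound is not evidence against white noise.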

Appendix: Benchmarking seasonal plot approaches

Two approaches to building seasonal plots with many traces:

  1. tableplot + plotly/plot post-processing: Let tableplot build traces, realize the spec with plotly/plot, then post-process each trace
  2. Pure manual traces: Build Plotly traces directly with mapv

Let's time them:

(defn seasonal-plot-manual
  "Build seasonal Plotly spec manually (no tableplot)."
  [ds phase-col value-col group-col color-fn
   & {:keys [line-width title x-title y-title]
      :or {line-width 0.3}}]
  (let [groups (-> ds (tc/group-by group-col) :data)
        traces (mapv (fn [group-ds]
                       (let [group-name (first (group-ds group-col))
                             sorted-ds (tc/order-by group-ds phase-col)]
                         {:x (vec (sorted-ds phase-col))
                          :y (vec (sorted-ds value-col))
                          :type "scatter"
                          :mode "lines"
                          :name group-name
                          :showlegend false
                          :line {:color (color-fn group-name)
                                 :width line-width}}))
                     groups)]
    {:data traces
     :layout {:title title
              :xaxis {:title x-title}
              :yaxis {:title y-title}}}))

Benchmark: tableplot+plot vs manual

Using the daily seasonal plot (one trace per day; 52,608 half-hourly rows is roughly 1,100 days)

(let [color-fn #(get {"2011" "#7570b3" "2012" "#1b9e77" "2013" "#d95f02" "2014" "#7570b3"}
                     (subs % 0 4) "gray")
      n 10]
  {:tableplot+post-process
   (let [start (System/nanoTime)]
     (dotimes [_ n]
       (seasonal-plot-spec vic-elec-with-fields "DailyPhase" "Demand" "TimeDate" color-fn))
     (/ (- (System/nanoTime) start) 1e6 n))
   
   :manual-traces
   (let [start (System/nanoTime)]
     (dotimes [_ n]
       (seasonal-plot-manual vic-elec-with-fields "DailyPhase" "Demand" "TimeDate" color-fn))
     (/ (- (System/nanoTime) start) 1e6 n))})
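
{:tableplot+post-process 1243.6000682, :manual-traces 260.545091}

Values are mean milliseconds per call over 10 runs, with no JIT warm-up. In this run the manual approach was roughly 4.8 times faster.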
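
The timing loop above is repeated for each approach and includes JIT compilation in the measurement. A minimal sketch of a reusable helper with a warm-up phase; mean-ms is a hypothetical name, and for serious measurements a dedicated benchmarking library would be a better fit.

```clojure
(defn mean-ms
  "Call f warmup times (untimed), then return the mean wall-clock
   milliseconds per call over runs timed calls."
  [f warmup runs]
  (dotimes [_ warmup] (f))
  (let [start (System/nanoTime)]
    (dotimes [_ runs] (f))
    (/ (- (System/nanoTime) start) 1e6 runs)))

;; Usage with a stand-in workload; the result varies by machine.
(mean-ms #(reduce + (range 100000)) 5 10)
```

The two seasonal-plot functions could be passed the same way, each wrapped in a zero-argument function.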

Understanding kindly/f and plotly/plot

Kindly is a portable notation protocol for Clojure visualizations. When tableplot builds a plot, it returns a "recipe" map like:

{:kindly/f #'plotly-xform        ;; transform function
 :data :=traces                  ;; placeholder
 ::ht/defaults {:=x "col" ...}}  ;; our bindings + dataset

The :kindly/f function transforms the recipe into actual Plotly JSON. This defers evaluation — Clay/Portal call it when rendering.

To get the raw spec for post-processing, use plotly/plot:

(let [viz (plotly/layer-line ...)
      spec (plotly/plot viz)]    ;; official API to force evaluation
  (update spec :data ...))       ;; now we can modify traces

Why deferred execution?

  • Lazy composition (chain layer-* calls before computing)
  • Tool flexibility (Clay, Portal render differently)
  • Introspection (inspect the recipe without triggering evaluation)
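
The deferred-evaluation idea can be illustrated without Kindly or tableplot. In this minimal sketch a recipe is a plain map that carries its own rendering function, and nothing is computed until realize is called. This is not Kindly's implementation: :render-fn, realize, and the recipe keys are hypothetical stand-ins for the role :kindly/f and plotly/plot play above.

```clojure
(defn realize
  "Apply the recipe's own rendering function to the recipe."
  [recipe]
  ((:render-fn recipe) recipe))

(def recipe
  {:render-fn (fn [{:keys [xs ys]}]
                {:data [{:x xs :y ys :type "scatter"}]})
   :xs [1 2 3]
   :ys [4 5 6]})

;; The recipe is ordinary data, so it can be inspected or changed before rendering.
(-> recipe
    (assoc :ys [7 8 9])
    realize)
;; => {:data [{:x [1 2 3], :y [7 8 9], :type "scatter"}]}
```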

Summary

| R (fpp3)        | Clojure (tablecloth + tableplot)                |
|-----------------|-------------------------------------------------|
| autoplot()      | plotly/layer-line                               |
| gg_season()     | extract year/month + layer-line with :=color    |
| gg_subseries()  | extract year/month + line plot colored by month |
| gg_lag()        | time-api/add-lags + layer-point                 |
| ACF()           | acf function (to be added to tablecloth.time)   |
| ggplot + geom_* | plotly/layer-point, plotly/layer-bar, etc.      |

New function for tablecloth.time: acf — autocorrelation computation. This should live in the column API alongside the field extractors.

source: notebooks/chapter_02_time_series_graphics.clj